ABSTRACT

Poverty is often likened to a disease that has infected many homes in different countries of the world and
that needs urgent treatment if the Millennium Development Goals are to be achieved. This article presents
the theoretical basis for modelling binary response data, as well as the empirical results of an analysis of
demographic data obtained from households in the Gbagyi community of the Federal Capital Territory, Abuja,
Nigeria. The study examines the prevalence and possible causes of poverty in the Dobi, Gwako and Bako
communities in Gwagwalada Area Council using a logistic regression model. The empirical analysis reveals
that poverty and the level of education of the household head are inversely related. It also shows a strong
association between poverty and the level of income of the household head. Age, household size, assets of
the household head and the other demographic variables considered are not statistically significant in the
estimated model. The estimated odds ratios suggest that ageing increases the likelihood of poverty, while
literacy, family size and sex decrease it. Based on the empirical results, the researcher recommends the
establishment of more public schools to raise the level of education, the provision of basic social amenities,
and the encouragement of family planning among rural dwellers, so as to boost socioeconomic status and
prevent a poverty trap.
I. INTRODUCTION

Logistic regression belongs to a category of statistical models called generalized linear models, which
also includes ordinary regression, Analysis of Variance (ANOVA), Analysis of Covariance (ANCOVA) and
loglinear regression (Agresti, 1996). Logistic regression is a standard tool for modelling effects and
interactions with binary response data (Park & Hastie, 2007). It makes possible the prediction of a discrete
outcome from a set of variables that may be continuous, discrete, dichotomous, or a mix of these, using the
most parsimonious model. The logistic regression (logit) model gives the conditional probability that an
event will occur given the values of the regressors, and also provides knowledge of the relationships among
variables and their strengths (Wright, 1995; Hyun & Ditton, 2006; Park & Hastie, 2007). Healy (2006) noted
that logistic regression is based on the log of the odds of a particular event occurring given a set of
observations. Its underlying principles are based on probabilities and the nature of the log curve.
The only assumptions of logistic regression are that the logit transformation is linear in the predictors,
that the dependent variable is dichotomous, and that the resulting logarithmic curve does not include
outliers. Hence normality-based assumptions, such as homogeneity of variance and normally distributed
observations and disturbance terms, are not required, normality tests are invalid, and the Ordinary Least
Squares (OLS) assumptions break down because of the dichotomous nature of the dependent variable. Logistic
regression is preferred by many researchers in analytical fields because of its robust practical nature, its
intuitive assumptions and its ability to produce a predictive representation of real-world situations
(Healy, 2006).

The logit model has been used extensively in analysing growth phenomena such as population, GDP and
money supply (Krammer, 1991). It has also featured in manufacturing and health-related studies (Healy, 2006),
recreational activities (Hyun & Ditton, 2006), examination results (Saha, 2011) and the determinants of
poverty (Achia et al., 2010), among others. These researchers apply a common theme of the theory of logit
models: placing an individual into distinct categories or groups rather than along a continuum. Because this
versatile model accommodates a non-linear relationship between the response and the predictor variables, it
has found applications in various fields of study, including epidemiology, demography and the social
sciences, among others.
Hence, this research work aims to use the logistic regression framework to find the possible statistical
relationship between poverty and annual household income, level of education of the household head, physical
assets, sex of the household head and size of household, among other variables possibly contributing to
poverty in the Gbagyi community of the Federal Capital Territory, Abuja, Nigeria.

II. THEORETICAL AND CONCEPTUAL FRAMEWORK

Poverty is a global phenomenon and is dynamic in nature, with many facets (World Bank, 2000; World
Development Report, 2001; NEEDS, 2004; Achia et al., 2010).

Poverty is the inability to attain a minimal standard of living, measured in terms of basic consumption needs
or the income required to satisfy them (World Bank, 2000).

A contemporary English dictionary defines poverty as the state of being poor, lacking in quality, or being
inferior.

Poverty is a multidimensional phenomenon (World Bank, 2000). Its various dimensions include lack of
opportunities, lack of empowerment, lack of security, and low access to health care and other social
infrastructural facilities, which makes poor citizens vulnerable to diseases, violence and so on.
Poverty is dynamic and has many dimensions (NEEDS, 2004). People may fall into poverty as a result of natural
disasters or health problems, lack of access to credit, or lack of natural resources. Poor people are more
likely to live in rural areas, be less educated and have larger families than the rest of the population.
According to NEEDS (2004), poverty has many causes, all of which reinforce one another. One source of
poverty is lack of basic amenities and services such as clean water, education and health care. Another is
lack of assets, such as land, tools, credit and supportive networks of friends and family. A third is lack of
income, including the means to obtain food, shelter, clothing and empowerment.

Many households are poor, and inequality in the distribution of incomes, assets, amenities and other social
services prevails (NEEDS, 2004; Okunmadewa et al., 2005).

See Achia et al. (2010) and Saha (2011) for extensive literature and referencing on poverty, demographic
variables, determinants of poverty in developing countries and logistic regression models, among others.
Many assertions have been made about poverty on the African continent: some scholars hold that poverty kills
faster than HIV/AIDS, and others identify poverty as a possible cause of recurrent uprisings, unrest and
civil disobedience in various regions.

Here, we seek to determine the trends and key determinants of poverty using various demographic variables in
some hamlets of the Gbagyi community in the Federal Capital Territory, Abuja, Nigeria, West Africa, by
applying a logistic regression model.

2.1 THE MODEL 

A regression model for a binary response variable describes the population proportion of occurrences of
an event. The population proportion of successes represents the probability $p(y = 1)$ for a randomly
selected subject. This probability varies according to the values of the explanatory variables. Models for
dichotomous data assume a binomial distribution for the dependent variable; this distribution is well
described in Hollander & Wolfe (1973, p. 15); Whittle (1976); Krzanowski (1998, p. 18); Gujarati (2004,
p. 583); Spiegel et al. (2004); and Awogbemi & Oguntade (2012). Let $y$ represent a dichotomous random
variable; the binary response variable $y$ then has two categories, denoted by 1 and 0. That is, the
dependent variable takes the value 1 with probability of success $\lambda$, or the value 0 with probability
of failure $1 - \lambda$. This type of variable is called a Bernoulli or binary variable.

Let

$$p(y = 1) = 1 - p(y = 0) = \lambda \qquad (1)$$

where $\lambda$ is defined by equation (2):

$$\lambda = \frac{e^{\alpha + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_n x_n}}{1 + e^{\alpha + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_n x_n}} \qquad (2)$$

Writing $u = e^{\alpha + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_n x_n}$, equation (2) can be succinctly
written as

$$\lambda = \frac{u}{1 + u} \qquad (3)$$

where $\alpha$ serves as the benchmark (intercept) of the equation and $\beta_i$ is the coefficient of the
exogenous variable $x_i$, for $i = 1, 2, \dots, n$.
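Equations (1) to (3) can be illustrated numerically. The following is a minimal Python sketch and not part of the original study; the function name `logistic_probability` and the example coefficient values are hypothetical and chosen only for illustration.

```python
import math

def logistic_probability(alpha, betas, xs):
    """Return lambda = u / (1 + u), with u = exp(alpha + sum(beta_i * x_i))."""
    if len(betas) != len(xs):
        raise ValueError("betas and xs must have the same length")
    linear_predictor = alpha + sum(b * x for b, x in zip(betas, xs))
    u = math.exp(linear_predictor)
    return u / (1.0 + u)

# Hypothetical values for illustration only (not estimates from the study).
p_success = logistic_probability(alpha=0.5, betas=[1.0, -2.0], xs=[1.0, 0.5])
p_failure = 1.0 - p_success  # equation (1): p(y = 0) = 1 - lambda
print(round(p_success, 4), round(p_failure, 4))
```

Whatever the values of the coefficients and regressors, the returned probability lies strictly between 0 and 1, which is the property that makes this form suitable for a Bernoulli response.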
The first task in model estimation is to transform the dependent variable and determine the
coefficients of the independent variables (Healy, 2006). The basic logistic regression analysis begins with
the logit transformation of the dependent variable, after which the coefficients are estimated by maximum
likelihood. The transformation uses what is popularly known as the odds ratio. The odds of an event are the
probability of the event divided by one minus that probability.
The odds are given by

$$\text{Odds} = \frac{\lambda(x)}{1 - \lambda(x)} \qquad (4)$$

Since, from (3), $\lambda(x) = \dfrac{u}{1 + u}$ with $u = e^{\alpha + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_n x_n}$,

$$1 - \lambda(x) = 1 - \frac{u}{1 + u} = \frac{1 + u - u}{1 + u} = \frac{1}{1 + u}$$

Therefore,

$$\text{Odds} = \frac{u}{1 + u} \div \frac{1}{1 + u} = \frac{u}{1 + u} \times \frac{1 + u}{1} = u$$

where $\lambda(x)$ is the probability of success, i.e. of the event occurring,
$1 - \lambda(x)$ is the probability of failure, i.e. of the event not occurring, and
$\alpha + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_n x_n$ represents the regression model.

Hence equation (4) can be transformed into an alternative form of the logistic regression equation by taking
the Napierian (natural) logarithm of the odds, popularly known as the logistic transformation (logit), to
obtain equation (5):

$$\text{logit}[\lambda(x)] = \ln\left[\frac{\lambda(x)}{1 - \lambda(x)}\right] = \alpha + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_n x_n \qquad (5)$$

(For elaborate details and extensive referencing on logistic regression models, see Krammer (1991),
Krzanowski (1998, p. 189), Gujarati (2004), Healy (2006) and Park & Hastie (2007), to mention but a few
scholars.)
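The two identities derived in this subsection, that the odds reduce to u and that the logit recovers the linear predictor, can be checked numerically. The sketch below is illustrative only; the helper names `odds` and `logit` and the value of the linear predictor are assumptions made for the example.

```python
import math

def odds(p):
    """Odds of an event with probability p, as in equation (4)."""
    return p / (1.0 - p)

def logit(p):
    """Natural logarithm of the odds, as in equation (5)."""
    return math.log(odds(p))

# Hypothetical linear predictor alpha + beta_1*x_1 + ... (illustration only).
eta = 0.8
u = math.exp(eta)
lam = u / (1.0 + u)      # equation (3)

print(odds(lam), u)      # the odds reduce to u
print(logit(lam), eta)   # the logit recovers the linear predictor
```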

2.2 WALD TEST OF SIGNIFICANCE FOR THE MODEL PARAMETER

To determine the significance of the independent variables, either the Wald statistic or the
likelihood ratio test can be used (Healy, 2006). The Wald statistic tests whether each coefficient
($\beta$) in the model is significantly different from zero.
A Wald test calculates a Z statistic:

$$Z = \frac{\hat{\beta}_i}{SE(\hat{\beta}_i)}$$

This Z value is then squared, yielding a Wald statistic that follows a chi-square distribution with one
degree of freedom.
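As a hedged illustration of the Wald calculation, the sketch below recomputes the statistic for Income(1) from the coefficient and standard error reported in Table 1. The function name `wald_test` is hypothetical, and the p-value uses the standard identity for a chi-square variable with one degree of freedom. The result is close to the Wald value reported in Table 1; the small difference presumably reflects rounding of the reported B and S.E.

```python
import math

def wald_test(coef, se):
    """Return (z, wald, p_value) for H0: coefficient = 0, with 1 degree of freedom."""
    z = coef / se
    wald = z ** 2
    # Survival function of a chi-square with 1 df: P(X > w) = erfc(sqrt(w / 2)).
    p_value = math.erfc(math.sqrt(wald / 2.0))
    return z, wald, p_value

# Coefficient and standard error reported for Income(1) in Table 1.
z, wald, p_value = wald_test(coef=1.258, se=0.335)
print(round(z, 3), round(wald, 3), p_value)
```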

2.3 LIKELIHOOD RATIO TEST OF INDEPENDENCE 

The likelihood-ratio test uses the ratio of the maximized value of the likelihood function for the
simpler model ($L_0$) to the maximized value of the likelihood function for the full model ($L_1$). The
likelihood-ratio test statistic equals:

$$-2\log\left(\frac{L_0}{L_1}\right) = -2\left[\log(L_0) - \log(L_1)\right] = -2(\ell_0 - \ell_1)$$

where $\ell_0 = \log(L_0)$ and $\ell_1 = \log(L_1)$ denote the maximized log-likelihoods of the two models.

This log transformation of the likelihood functions yields a chi-squared statistic.
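The likelihood-ratio calculation can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the procedure used in the study: the function names are hypothetical, the log-likelihood values are invented for the example, and the p-value uses a closed form that holds only when the degrees of freedom (the number of extra parameters in the full model) are even.

```python
import math

def likelihood_ratio_statistic(loglik_reduced, loglik_full):
    """-2 * (log-likelihood of reduced model - log-likelihood of full model)."""
    return -2.0 * (loglik_reduced - loglik_full)

def chi2_sf_even_df(x, df):
    """P(X > x) for a chi-square variable with an even number of degrees of freedom.

    Uses the closed form exp(-x/2) * sum_{j=0}^{df/2 - 1} (x/2)^j / j!.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form requires a positive even df")
    half = x / 2.0
    return math.exp(-half) * sum(half ** j / math.factorial(j) for j in range(df // 2))

# Hypothetical maximized log-likelihoods (illustration only).
stat = likelihood_ratio_statistic(loglik_reduced=-100.0, loglik_full=-92.0)
p_value = chi2_sf_even_df(stat, df=4)
print(stat, p_value)
```

A small p-value indicates that the full model fits significantly better than the simpler one, i.e. that at least one of the additional coefficients differs from zero.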
III. EMPIRICAL ANALYSIS

3.1 THE DATA 

A random sample of 250 indigenes of Gbagyi communities in Gwagwalada Area Council (GAC),
Abuja, Nigeria, was selected for the study. The selected hamlets include the Gwako, Dobi and Passo-Gbagyi
communities in FCT Abuja. The data were collected using self-designed and administered
questionnaires, and the process was consistent with Hyun and Ditton (2006).

3.2 RESULTS AND DISCUSSION 

Table 1: Model information

                                                              95.0% C.I. for Exp(B)
Step 1a          B       S.E.    Wald     df   Sig.   Exp(B)   Lower    Upper
Age              .011    .013    .656     1    .418   1.012    .963     1.016
Size            -.038    .046    .705     1    .401   .962     .880     1.052
Sex(1)          -.333    .453    .538     1    .463   .717     .295     1.744
Education(1)    -.171    .004    7.405    1    .040   .843     .833     .853
Landlord(1)      .203    .315    .412     1    .521   1.224    .660     2.272
Income(1)        1.258   .335    14.096   1    .000   3.517    1.824    6.780
Migrant(1)      -.402    .312    1.660    1    .198   .669     .363     1.233
Marriage(1)      .475    .321    2.182    1    .140   1.608    .856     3.018
Constant         .879    .882    .992     1    .319   2.408

a. Variable(s) entered on step 1: Age, Size, Sex, Education, Landlord, Income, Migrant, Marriage.
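The Exp(B) column and its confidence limits in Table 1 follow from B and S.E. The sketch below assumes the usual Wald interval, exp(B ± 1.96 × S.E.); the source does not state how the interval was computed, so this is an assumption, and the function name `odds_ratio_with_ci` is hypothetical. For the two rows used here, the computed values agree with the table to within rounding.

```python
import math

def odds_ratio_with_ci(coef, se, z_crit=1.96):
    """Return (exp(B), lower, upper) using the Wald interval exp(B +/- z_crit * SE)."""
    return (math.exp(coef),
            math.exp(coef - z_crit * se),
            math.exp(coef + z_crit * se))

# B and S.E. as reported in Table 1 for two of the predictors.
reported = {"Landlord(1)": (0.203, 0.315), "Income(1)": (1.258, 0.335)}
for name, (coef, se) in reported.items():
    or_, lower, upper = odds_ratio_with_ci(coef, se)
    print(f"{name}: Exp(B)={or_:.3f}, 95% CI=({lower:.3f}, {upper:.3f})")
```

An interval that contains 1, as for Landlord(1), corresponds to a coefficient that is not significant at the 5% level; an interval that excludes 1, as for Income(1), corresponds to a significant one.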



From Table 1 above, the variables income and education significantly determine the poverty level in the said
communities, while all other variables, including the constant (intercept), are not statistically
significant at the 5% level of significance. Thus, the regression model reduces to equation (6):

$$\text{logit}[\lambda(x)] = 1.258\,X_{\text{income}(1)} - 0.171\,X_{\text{education}(1)} \qquad (6)$$

Hence, the fitted logistic regression model is

$$\lambda(x) = \frac{e^{1.258\,X_{\text{income}(1)} - 0.171\,X_{\text{education}(1)}}}{1 + e^{1.258\,X_{\text{income}(1)} - 0.171\,X_{\text{education}(1)}}}$$
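To show what the reduced model implies, the sketch below evaluates the fitted probability for each combination of the two indicator variables, using the Income(1) and Education(1) coefficients from Table 1. It assumes that both predictors enter as 0/1 indicators and that the fitted model has the logistic form of equation (2) with no intercept; the source does not state how the indicators are coded, so the output should be read as an illustration only. The function name `fitted_probability` is hypothetical.

```python
import math

# Coefficients of the reduced model as reported in Table 1.
B_INCOME = 1.258
B_EDUCATION = -0.171

def fitted_probability(income_ind, education_ind):
    """Probability implied by the reduced model for 0/1 indicator values."""
    eta = B_INCOME * income_ind + B_EDUCATION * education_ind
    return math.exp(eta) / (1.0 + math.exp(eta))

for income_ind in (0, 1):
    for education_ind in (0, 1):
        p = fitted_probability(income_ind, education_ind)
        print(f"Income(1)={income_ind}, Education(1)={education_ind}: {p:.3f}")
```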



Log Likelihood is obtained as: 
-2LogLikelihood= Reduced Model-Full Model 
=288.925-269.209=21.482 



The result shows that, in the Gbagyi communities under study, the level of education of the household
head is inversely related to the incidence of poverty. Thus, an increase in educational attainment reduces
the probability that a household is poor, and the variable is statistically significant, as shown in
Table 1. Likewise, income is significant in determining poverty status in the estimated logistic regression
model; hence the level of income, as a measure of the per capita income of the household head, is strongly
associated with the socioeconomic status of the Gbagyi communities. The other demographic factors considered
are not statistically significant, so the data gathered provide no evidence that they are related to the
prevalence of poverty in these communities. The odds ratio of literacy to illiteracy is 0.843, and the odds
ratio of high income to low income is 3.517. The odds ratios in Table 1 indicate that ageing increases the
probability of poverty, that family size and literacy decrease it, and that low income increases it. In
general, a variable with an estimated odds ratio less than one is associated with a decreased probability of
poverty, and a variable with an odds ratio greater than one is associated with an increased probability of
poverty.
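The rule of thumb for reading odds ratios can be made concrete by converting each ratio into a percentage change in the odds. The sketch below uses the two significant Exp(B) values from Table 1; the function name `percent_change_in_odds` is hypothetical, and the figures describe changes in odds, not in probabilities.

```python
def percent_change_in_odds(odds_ratio):
    """Percentage change in the odds associated with an odds ratio."""
    return (odds_ratio - 1.0) * 100.0

# Exp(B) values reported in Table 1.
for name, odds_ratio in {"Education(1)": 0.843, "Income(1)": 3.517}.items():
    change = percent_change_in_odds(odds_ratio)
    direction = "higher" if change > 0 else "lower"
    print(f"{name}: odds are {abs(change):.1f}% {direction}")
```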
Table 2: Hosmer and Lemeshow Goodness of Fit Test

Step    Chi-square    df    Sig.
1       15.136        8     .057



From the Hosmer and Lemeshow test, the model provides an adequate fit to the data, as the Hosmer and Lemeshow
statistic gives a non-significant chi-square value of 15.136 (p = .057) at the 5% level of significance.
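The significance level reported for the Hosmer and Lemeshow statistic can be reproduced from the chi-square value and its degrees of freedom. The sketch below is an illustrative check using only the Python standard library; the function name is hypothetical, and the closed form it uses is valid only for an even number of degrees of freedom, which holds here (df = 8).

```python
import math

def chi2_sf_even_df(x, df):
    """P(X > x) for a chi-square variable with an even number of degrees of freedom."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form requires a positive even df")
    half = x / 2.0
    return math.exp(-half) * sum(half ** j / math.factorial(j) for j in range(df // 2))

# Chi-square value and degrees of freedom reported in Table 2.
hl_p = chi2_sf_even_df(15.136, 8)
print(round(hl_p, 3))
```

Because the p-value exceeds 0.05, the null hypothesis of adequate fit is not rejected, although the margin is narrow.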



Table 3: Omnibus Tests of Model Fit Coefficients (Likelihood Ratio test)

                   Chi-square    df    Sig.
Step 1    Step     21.482        8     .006
          Block    21.482        8     .006
          Model    21.482        8     .006



The -2 log likelihood statistic is 21.482, which is significant at the 0.05 level of significance. This
implies that at least one of the parameters is not equal to zero and that the model fits well. Thus, the
full model is significant, as shown by the -2 log likelihood statistic.